A novel approach for comparing web sites by using MicroGenres
Abstract
In this paper, a novel approach is introduced to compare web sites by analysing their web page content. Each web page can be expressed as a set of entities called MicroGenres, which in turn are abstractions of the design patterns and genres used to represent the page content. This description is useful for web page and web site classification and for a deeper insight into the web site's social context. Web site comparison is useful for extracting patterns that can be used to improve Web search engine effectiveness, to identify best practices in web site design and to organize web page content so as to personalize the user experience on a web site. The effectiveness of the proposed approach was tested in a real-world case with e-shop web sites, showing that a web site can be represented at a high level of abstraction by using MicroGenres, the contents of which can then be compared and given a measure corresponding to web site similarity. This measure is very useful for detecting web communities on the Web, i.e., groups of web sites sharing similar contents, and the result is essential in performing a focused and effective information search as well as minimizing web page retrieval. © 2014 Elsevier Ltd. All rights reserved.
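The abstract does not state how the similarity measure is computed, so the following is only an illustrative sketch of the general idea: each site is summarized by the MicroGenre labels found on its pages, and two summaries are then compared. Cosine similarity over label counts is an assumption made here for illustration, not the measure used in the paper, and the label names (`product_list`, `cart`, and so on) and the functions `site_profile` and `cosine_similarity` are hypothetical.

```python
from collections import Counter
from math import sqrt


def site_profile(pages):
    """Aggregate per-page MicroGenre labels into one frequency profile per site."""
    profile = Counter()
    for labels in pages:
        profile.update(labels)
    return profile


def cosine_similarity(a, b):
    """Cosine similarity between two label-frequency profiles (0.0 to 1.0)."""
    # Only labels present in both profiles contribute to the dot product.
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty profile is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical sites: each inner list holds the labels assigned to one page.
shop_a = site_profile([["product_list", "search_box"], ["product_detail", "cart"]])
shop_b = site_profile([["product_list", "cart"], ["product_detail", "search_box"]])
blog = site_profile([["article", "comment_thread"], ["article", "archive"]])

print(cosine_similarity(shop_a, shop_b))  # identical label counts
print(cosine_similarity(shop_a, blog))    # no labels in common
```

Under these assumptions, sites whose pairwise similarity exceeds some chosen threshold could be grouped into a candidate web community; both the threshold and the grouping method would need to be chosen and validated against real data.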
Similar resources
A Novel Approach to Feature Selection Using PageRank algorithm for Web Page Classification
In this paper, a novel filter-based approach is proposed using the PageRank algorithm to select the optimal subset of features as well as to compute their weights for web page classification. To evaluate the proposed approach multiple experiments are performed using accuracy score as the main criterion on four different datasets, namely WebKB, Reuters-R8, Reuters-R52, and 20NewsGroups. By analy...
Positioning of Industries in Cyberspace Evaluation of Web Sites Using Correspondence Analysis
In today’s extremely competitive markets it is crucial for companies to strategically position their brands, products and services relative to their competitors. With the emerging trend towards internationalization of companies, especially SMEs, and the growing use of the Internet in this regard, a great deal of attention has been turned to effective involvement of the Internet channel in the mar...
Towards Supporting Exploratory Search over the Arabic Web Content: The Case of ArabXplore
Due to the huge amount of data published on the Web, the Web search process has become more difficult, and it is sometimes hard to get the expected results, especially when the users are less certain about their information needs. Several efforts have been proposed to support exploratory search on the web by using query expansion, faceted search, or supplementary information extracted from exte...
Formation interface detection using Gamma Ray log: A novel approach
There are two methods for identifying formation interfaces in oil wells: core analysis, which is precise but costly and time-consuming, and well log analysis, which is performed by petrophysicists and is subjective and not completely reliable. In this paper, a novel coupled method is proposed to detect formation interfaces using GR logs. Second approximation level (a2) of GR log gained...
Expert Discovery: A web mining approach
Expert discovery seeks to answer the question: “Who is the best expert of a specific subject in a particular domain within a peculiar array of parameters?” An expert with domain knowledge in any field is crucial for consulting in industry, academia and the scientific community. The aim of this study is to address the issues of the expert-finding task in a real-world community. Collabor...
Journal title:
- Eng. Appl. of AI
Volume 35
Publication date: 2014